Broadcast News Story Segmentation Using Conditional Random Fields and Multimodal Features
Abstract
This paper proposes integrating multi-modal features using conditional random fields (CRF) for broadcast news story segmentation. We study story boundary cues from the lexical, audio and video modalities. Lexical features consist of lexical similarity, chain strength and overall cohesiveness; acoustic features comprise pause duration, pitch, speaker change and audio event type; and visual features comprise shot boundary, anchor face and news title caption. These features are extracted at a sequence of boundary candidate positions in the broadcast news. A linear-chain CRF then labels each candidate as boundary or non-boundary based on the multi-modal features. The CRF's sequential learning framework effectively captures important inter-label relations and contextual feature information. Story segmentation experiments show that the CRF approach outperforms other popular classifiers, including decision tree (DT), Bayesian network (BN), naive Bayesian classifier (NB), multi-layer perceptron (MLP), support vector machine (SVM) and maximum entropy (ME) classifiers.
Key words: story segmentation, conditional random fields
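The labeling step described in the abstract can be illustrated with a minimal sketch of Viterbi decoding over a linear-chain model with two labels. This is not the paper's implementation: the feature layout (bias, pause duration, speaker change, shot boundary), all weights, and the transition scores below are hypothetical values chosen for illustration, and training of the weights is omitted.

```python
from typing import List, Sequence

# Label indices: 0 = non-boundary, 1 = boundary.

def emission_scores(features: Sequence[Sequence[float]],
                    weights: Sequence[Sequence[float]]) -> List[List[float]]:
    """Score each label at each candidate as a weighted sum of its features."""
    return [[sum(w * x for w, x in zip(weights[y], f))
             for y in range(len(weights))]
            for f in features]

def viterbi(emission: Sequence[Sequence[float]],
            transition: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence.

    emission[t][y]   : score of label y at candidate position t
    transition[a][b] : score of label a being followed by label b
    """
    if not emission:
        return []
    n_labels = len(transition)
    best = list(emission[0])          # best score of any path ending in each label
    backptrs = []                     # backptrs[t-1][y] = best previous label
    for t in range(1, len(emission)):
        new_best, ptr = [], []
        for y in range(n_labels):
            prev = max(range(n_labels),
                       key=lambda a: best[a] + transition[a][y])
            ptr.append(prev)
            new_best.append(best[prev] + transition[prev][y] + emission[t][y])
        best = new_best
        backptrs.append(ptr)
    # Trace back from the best final label.
    y = max(range(n_labels), key=lambda a: best[a])
    path = [y]
    for ptr in reversed(backptrs):
        y = ptr[y]
        path.append(y)
    path.reverse()
    return path

# Hypothetical candidates: [bias, pause duration (s), speaker change, shot boundary]
features = [
    [1.0, 0.1, 0.0, 0.0],
    [1.0, 1.2, 1.0, 1.0],
    [1.0, 0.9, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0],
]
# Hypothetical weights: one row per label.
weights = [
    [0.0, 0.0, 0.0, 0.0],    # non-boundary
    [-2.0, 2.0, 1.0, 1.0],   # boundary
]
# Hypothetical transition scores: two adjacent boundaries are penalized.
transition = [
    [0.0, 0.0],
    [0.0, -3.0],
]

emission = emission_scores(features, weights)
# Per-candidate decision that ignores neighboring labels.
independent = [max(range(2), key=lambda y: e[y]) for e in emission]
# Sequence decision that accounts for label-to-label transitions.
decoded = viterbi(emission, transition)
```

In this toy setup the per-candidate decision marks both the second and third candidates as boundaries, while sequence decoding keeps only the stronger one because of the transition penalty. That is one simple way inter-label relations can change the outcome compared with classifying each candidate independently.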
Similar resources
Modeling Broadcast News Prosody Using Conditional Random Fields for Story Segmentation
This paper proposes to model broadcast news prosody using conditional random fields (CRF) for news story segmentation. Broadcast news has both editorial prosody and speech prosody that convey essential structural information for story segmentation. Hence we extract prosodic features, including pause duration, pitch, intensity, rapidity, speaker change and music, for a sequence of boundary candi...
Broadcast News Story Boundary Detection Using Visual, Audio and Text Features
News video story segmentation is vital for video summarization, story linking, and curation. We present a multimodal segmentation algorithm which fuses video, audio and text cues for story boundary detection. We show that broadcast news closed captioning is a rich and readily available source that improves story boundary detection. Furthermore, we propose an empirical distribution-based feature...
Feature Selection for Trainable Multilingual Broadcast News Segmentation
Indexing and retrieving broadcast news stories within a large collection requires automatic detection of story boundaries. This video news story segmentation can use a wide range of audio, language, video, and image features. In this paper, we investigate the correlation between automatically-derived multimodal features and story boundaries in seven different broadcast news sources in three lan...
Story Segmentation of Broadcast News in English, Mandarin and Arabic
In this paper, we present results from a Broadcast News story segmentation system developed for the SRI NIGHTINGALE system operating on English, Arabic and Mandarin news shows to provide input to subsequent question-answering processes. Using a rule-induction algorithm with automatically extracted acoustic and lexical features, we report success rates that are competitive with state-of-the-art s...
Discovery and fusion of salient multimodal features toward news story segmentation
In this paper, we present our new results in news video story segmentation and classification in the context of TRECVID video retrieval benchmarking event 2003. We applied and extended the Maximum Entropy statistical model to effectively fuse diverse features from multiple levels and modalities, including visual, audio, and text. We have included various features such as motion, face, music/spe...
Journal title:
- IEICE Transactions
Volume 95-D, issue
Pages -
Publication date 2012